ABSTRACT

This paper describes the PKUTM participation in the
TREC 2010 Blog Track. We concentrated only on
the Faceted Blog Distillation Task this year. Our system
adopts a two-stage approach for this task. In the first
stage, our system uses the Indri IR platform to
obtain the top N ad-hoc topic-relevant blog posts for
each query. In the second stage, different models are
designed to identify the facet inclination. The
experimental results show the effectiveness of our
approach.

1.  Introduction 

In this paper, we describe the participation of PKUTM in
the TREC 2010 Blog Track. The Blog Track explores
information-seeking behavior in the blogosphere. It
was first introduced in TREC 2006 [8], with a main pilot
search task, namely the opinion-finding task. This year
there are two tasks: the Faceted Blog Distillation Task
and the Top Stories Identification Task. The PKUTM group
participated only in the Faceted Blog Distillation Task. The
PKUTM system is based on the Indri [6] framework and
uses a two-stage approach. The system first
retrieves the top N topic-relevant blog posts and then
analyzes them for each of the three facet inclination
identification sub-tasks. For the
opinion/factual facet, our system uses two different
ranking strategies and a novel opinion retrieval model.
For the personal/official facet, the facet is predicted
based on the proportion of pronouns and the presence of
named entities and offensive words. For the
in-depth/shallow facet, the facet is considered closely
related to the proportion of regular words according
to the word-building rules.

2.  Collection  and  Preprocessing 

The TREC blog08 collection, consisting of permalinks,
feeds and blog homepages, is again used in TREC 2010.
We used only the permalinks in the Faceted Blog
Distillation Task. The permalinks are HTML documents
that contain the relevant content together with much
irrelevant content such as HTML tags, advertisements,
site descriptions and menus. For the Faceted Blog
Distillation Task, the irrelevant content is noise, so we
have to extract the relevant content from the permalinks.
We propose a simple but effective algorithm for this
purpose. We first assume that content which invariably
appears in every post of a given feed is irrelevant
[3]. Then, we regard most hyperlinks, for example
advertisements, as a second kind of irrelevant content,
although not all hyperlinks are irrelevant. Our
algorithm identifies both kinds of irrelevant content
effectively.

For example, let $p_i$ be any blog post, and let $p_j$ be a
blog post from the same blog feed as $p_i$. Then

$$\mathrm{Noise}(p_i) = \big(\mathrm{Content}(p_i) \cap \mathrm{Content}(p_j)\big) + \mathrm{AdHyperlink}(p_i)$$

where $\mathrm{Noise}(p_i)$ denotes the irrelevant content of
$p_i$, $\mathrm{Content}(p_i)$ and $\mathrm{Content}(p_j)$ denote the content of
$p_i$ and $p_j$ respectively, and $\mathrm{AdHyperlink}(p_i)$ denotes
the irrelevant hyperlinks of $p_i$. The idea for identifying
the irrelevant hyperlinks is similar to [11]. Finally,
$\mathrm{Noise}(p_i)$ and HTML tags are filtered out from $p_i$.
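
The noise definition above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: each post is assumed to be already split into text blocks (for example, one block per HTML element), the intersection is taken over all posts of the feed rather than a single pair, and the set of advertisement hyperlinks is assumed to come from a separate detector. The function names `find_feed_boilerplate` and `strip_noise` are hypothetical and do not come from the PKUTM system.

```python
from typing import List, Set


def find_feed_boilerplate(posts: List[List[str]]) -> Set[str]:
    """Return the text blocks that appear in every post of one feed."""
    if len(posts) < 2:
        return set()  # a single post gives nothing to compare against
    common = set(posts[0])
    for blocks in posts[1:]:
        common &= set(blocks)  # keep only blocks shared by all posts so far
    return common


def strip_noise(blocks: List[str], boilerplate: Set[str],
                ad_links: Set[str]) -> List[str]:
    """Drop blocks that are feed-wide boilerplate or flagged ad hyperlinks."""
    return [b for b in blocks if b not in boilerplate and b not in ad_links]


feed = [
    ["Menu", "Post A text", "Buy now"],
    ["Menu", "Post B text", "About us"],
]
boilerplate = find_feed_boilerplate(feed)
print(strip_noise(feed[0], boilerplate, ad_links={"Buy now"}))
```

Exact string matching is the simplest possible notion of "the same content"; a real system would probably need some normalization before comparing blocks.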

In addition, we find that comments can lower the
accuracy in the opinion/factual and official/personal
facet inclination identification sub-tasks. Because there
is no common method for removing comments across
different web sites, we simply use the first part
of each blog post instead of the whole post in these two
sub-tasks.

3.  Faceted  Blog  Distillation  Task 

In  this  section,  we  describe  our  approaches  for  the 
Faceted  Blog  Distillation  Task  in  detail. 

3.1.  Topical  blog  distillation  sub-task 

First, we have to obtain the top N ad-hoc topic-relevant
blog posts. In our system, we set N to 10000.

3.1.1.  Query  Expansion 

The topics of TREC 2010 contain five fields, namely
‘num’, ‘query’, ‘desc’, ‘facet’ and ‘narr’. We consider
‘desc’ and ‘narr’ helpful for retrieving the
topic-relevant blog posts. We design a simple but
effective algorithm to extract useful words from these
two fields, which are then used to expand the query. Query
expansion deals with the word mismatch
problem caused by short queries. Since queries for
the Faceted Blog Distillation Task are usually short, we
expect query expansion to play an important role
in improving the performance of topic-relevant retrieval.
For example, below is one sentence of Topic 1103 in
TREC 2009.

I  want  to  find  blogs  about  farm  subsidies  in  the 
United  States. 

We regard the words ‘farm’, ‘subsidies’, ‘United’
and ‘States’ as useful information, while the remaining
words such as ‘want’ and ‘find’ are useless for retrieving
topic-relevant blog posts. From this we conclude
that the nouns of a sentence are probably
useful words. So our algorithm extracts all the nouns and
noun phrases of the sentences. Finally, stop words
are removed from them. The Stanford Parser (footnote 1) and
Tregex (footnote 2) are used to get the nouns and noun
phrases from the parse trees.
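
As a rough illustration of the extraction step, the sketch below keeps noun tokens from a sentence that has already been part-of-speech tagged. It is not the Stanford Parser and Tregex pipeline itself: the Penn Treebank tags are assigned by hand here, noun phrases are not handled, and the stop list `STOP_WORDS` is a hypothetical stand-in, since the actual list is not given in the text.

```python
from typing import List, Tuple

# Hypothetical stop list; the list actually used is not given in the paper.
STOP_WORDS = {"blog", "blogs", "information"}


def expansion_terms(tagged: List[Tuple[str, str]]) -> List[str]:
    """Keep tokens with a Penn Treebank noun tag (NN, NNS, NNP, NNPS),
    then drop stop words."""
    terms = []
    for word, tag in tagged:
        if tag.startswith("NN") and word.lower() not in STOP_WORDS:
            terms.append(word)
    return terms


# Topic 1103 sentence with hand-assigned tags (an assumption, not parser output).
tagged_sentence = [
    ("I", "PRP"), ("want", "VBP"), ("to", "TO"), ("find", "VB"),
    ("blogs", "NNS"), ("about", "IN"), ("farm", "NN"), ("subsidies", "NNS"),
    ("in", "IN"), ("the", "DT"), ("United", "NNP"), ("States", "NNPS"),
    (".", "."),
]
print(expansion_terms(tagged_sentence))
# ['farm', 'subsidies', 'United', 'States']
```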

3.1.2.  Baseline 

In the baseline stage, we submitted the following two
baselines:

1. PKUTMB1 is an automatic ‘query-only’ run, which
is compulsory in TREC 2010. In this run, participants
are allowed to use only the ‘query’ field of the topic.
Since Indri [6] supports a structured query language, an
example query of this run for Topic 1103 is as
follows:

<query>
#weight(1.0 farm 1.0 subsidies
        2.0 #1(farm subsidies)
        1.5 #uw5(farm subsidies))
</query>

2. PKUTMB2 is also an automatic run. Apart from the
‘query’ field of the topic, the query of this run also
contains the expansion words produced by the
algorithm of Section 3.1.1. For instance, the query for
Topic 1103 is as follows:

<query>
#weight(1.0 farm 1.0 subsidies
        2.0 #1(farm subsidies)
        1.5 #uw5(farm subsidies)
        0.8 #uw5(united states)
        0.5 #combine(united states farmers
                     government farmers products))
</query>
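
The query-only pattern shown for Topic 1103 can be generated mechanically. The Python sketch below is an assumption about how such queries might be built: it reuses the weights visible in the example (1.0 per term, 2.0 for the exact phrase, 1.5 for the 5-word unordered window), but the text does not say whether the same weights were used for every topic, and the single-term case is a guess. The helper name `build_query` is hypothetical.

```python
from typing import List


def build_query(terms: List[str]) -> str:
    """Build an Indri #weight query: single terms, the exact phrase (#1),
    and a 5-word unordered window (#uw5)."""
    parts = [f"1.0 {t}" for t in terms]
    if len(terms) > 1:  # phrase operators only make sense for 2+ terms
        phrase = " ".join(terms)
        parts.append(f"2.0 #1({phrase})")
        parts.append(f"1.5 #uw5({phrase})")
    return "#weight(" + " ".join(parts) + ")"


print(build_query(["farm", "subsidies"]))
# #weight(1.0 farm 1.0 subsidies 2.0 #1(farm subsidies) 1.5 #uw5(farm subsidies))
```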

Since the ranking unit of the Faceted Blog Distillation
Task is the blog feed, we need to obtain the topic-relevant
score of each blog feed. For both baselines above, the
feed's topic-relevant score is calculated as
follows:


1  http://nlp.stanford.edu/software/lex-parser.shtml 

2  http://nlp.stanford.edu/software/tregex.shtml 


$$\mathrm{Score}_R(\mathrm{Feed}) = \Big(\max_{p \in \mathrm{Feed}^{Top}} \mathrm{Score}(p)\Big) \times \frac{|\mathrm{Feed}^{Top}|}{|\mathrm{Feed}|}$$

where $\mathrm{Score}_R(\mathrm{Feed})$ is the topic-relevant score of
Feed, $\mathrm{Score}(p)$ is Indri's retrieval score of blog post
$p$ belonging to Feed, and $|\mathrm{Feed}^{Top}|$ and $|\mathrm{Feed}|$ are
the numbers of corresponding blog posts in the Top N
collection and the whole blog08 collection, respectively.

3.2.  Facet  inclination  identification  sub-task 

In this section, we introduce our models for the
opinion/factual, personal/official and in-depth/shallow
facet inclination identification sub-tasks. In
this second stage, we apply our facet models to the
blog posts retrieved in the first stage.

3.2.1.  Opinionated  vs.  Factual  Model 

As  we  aim  to  find  the  blog  feeds  which  are  not  only 
interested  in  a  given  topic,  but  also  make  opinionated 
expressions  on  this  topic,  we  adopt  two  different  ranking 
strategies  -  Average  Strategy  and  Maximum  Strategy, 
and  a  novel  Opinion  Retrieval  Model  to  solve  this 
problem.  These  approaches  are  all  based  on  the 
presence  of  sentiment  words. 

1.  Sentiment  Lexicon 

For  the  opinion  facet  identification  sub-task,  we 
constructed  our  own  sentiment  lexicon  based  on  the 
following  lexicons. 

SentiWordNet

SentiWordNet [2] is a lexical resource for opinion
mining. SentiWordNet assigns to each synset of WordNet
three sentiment scores: positivity, negativity and objectivity.
We obtain the opinion score of each synset by summing
its positivity score and negativity score. For a word,
if the opinion score of any synset that it belongs
to is not smaller than 0.6, we add it to our own sentiment
lexicon.


HowNet 

HowNet [1] is a knowledge database of the Chinese
language, and some of the words in the dictionary have
positive or negative properties. We use the English
translations of those sentiment words provided by
HowNet. There are 1001 negative sentiment words and
769 positive sentiment words. Since the HowNet words
do not have opinion scores, we simply assign 0.8 to each
word as its opinion score. In addition, HowNet contains an
opinion operator lexicon. Following [9], we consider
operator words such as ‘advocate’ and ‘believe’ to be
important clues for sentences that contain the author’s
opinion. We simply assign 1.0 to each operator word as
its opinion score.

OpinionFinder’s  Subjectivity  Lexicon 

The Subjectivity Lexicon [10] is compiled from the manually
annotated MPQA corpus, which contains a wide variety
of news articles. The words in the Subjectivity Lexicon
have been labeled with part-of-speech tags as well as
either strong subjective or weak subjective tags,
depending on the reliability of the subjective nature of the
word. We use only the strong subjective words in this
task. The words in the Subjectivity Lexicon do not have
opinion scores. Since this lexicon is constructed
manually, we consider it more reliable, so
we assign 1.0 to each word as its opinion score.

Indicator 

Following [9], we regard opinion indicator words such as
‘would’ and ‘should’ as another significant clue to the
author’s opinion. We chose 9 indicators (‘would’,
‘could’, ‘pity’, ‘should’, ‘might’, ‘maybe’,
‘but’, ‘in fact’, ‘consequently’) which achieve higher
precision on the data of TREC 2009. We simply assign
1.0 to each indicator as its opinion score.

Finally, we remove 1326 sentiment words of SentiWordNet,
HowNet and OpinionFinder’s Subjectivity Lexicon
which get lower precision on TREC 2009 data.
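
The lexicon construction described in this subsection can be summarized as a merge of several word lists with fixed scores. The Python sketch below makes three assumptions that the text does not state: a SentiWordNet word receives its highest synset opinion score, a word found in several sources keeps the highest of its scores, and the pruned words are supplied as a ready-made list. The function name `build_lexicon` is hypothetical.

```python
from typing import Dict, Iterable, List


def build_lexicon(
    swn_synsets: Dict[str, List[float]],  # word -> (pos + neg) score per synset
    hownet_words: Iterable[str],
    operator_words: Iterable[str],
    strong_subj_words: Iterable[str],
    indicators: Iterable[str],
    removed: Iterable[str] = (),
) -> Dict[str, float]:
    lexicon: Dict[str, float] = {}

    def add(word: str, score: float) -> None:
        # Assumption: on conflicts between sources, keep the highest score.
        lexicon[word] = max(score, lexicon.get(word, 0.0))

    for word, scores in swn_synsets.items():
        if scores and max(scores) >= 0.6:  # "not smaller than 0.6"
            add(word, max(scores))         # assumption: best synset score
    for word in hownet_words:
        add(word, 0.8)
    for group in (operator_words, strong_subj_words, indicators):
        for word in group:
            add(word, 1.0)
    for word in removed:                   # words pruned on TREC 2009 data
        lexicon.pop(word, None)
    return lexicon


lexicon = build_lexicon(
    swn_synsets={"awful": [0.75, 0.25], "table": [0.0]},
    hownet_words=["good"],
    operator_words=["believe"],
    strong_subj_words=["awful"],
    indicators=["should"],
    removed=["good"],
)
print(lexicon)
```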


2.  Opinion  Scoring 

For one blog post, the opinion score is computed as
follows:

$$\mathrm{Score}_{opin}(post) = \sum_{t} w_{tf}(post, t) \times w_{op}(t)$$

where $\mathrm{Score}_{opin}(post)$ stands for the opinion score of
a blog post, $w_{tf}(post, t)$ denotes the term frequency of
opinion word $t$ in the blog post, and $w_{op}(t)$
corresponds to the opinion score of word $t$.
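
A small sketch of the post-level opinion score, read as a sum over the opinion words of the post. It assumes raw counts for the term frequency and lowercase whitespace tokens; the paper does not specify either, and the function name `opinion_score` is hypothetical.

```python
from collections import Counter
from typing import Dict, List


def opinion_score(tokens: List[str], lexicon: Dict[str, float]) -> float:
    """Sum over opinion words of (term frequency in the post) * (opinion weight)."""
    counts = Counter(t.lower() for t in tokens)
    return sum(tf * lexicon[w] for w, tf in counts.items() if w in lexicon)


toy_lexicon = {"awful": 1.0, "should": 1.0, "good": 0.8}  # illustrative weights
post = "This policy is awful and should end , awful indeed".split()
print(opinion_score(post, toy_lexicon))  # 2 * 1.0 + 1 * 1.0 = 3.0
```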

Similar  to  3.1.2,  we  also  need  to  calculate  the 
opinion/factual  score  of  a  blog  feed.  Two  different 
strategies  are  used  for  computing  the  blog  feed's 
opinion/factual  score. 


Average  Strategy  (AS) 

Under this strategy, the opinion score of a blog feed is
calculated as follows:

$$\mathrm{Score}_{opin}(\mathrm{Feed}) = \frac{\sum_{post \in \mathrm{Feed}^{Top}} \mathrm{Score}_{opin}(post)}{|\mathrm{Feed}^{Top}|}$$

We simply compute the blog feed’s factual score through
the following equation.

$$\mathrm{Score}_{fact}(\mathrm{Feed}) = -\mathrm{Score}_{opin}(\mathrm{Feed})$$


Maximum  Strategy  (MS) 

Under this strategy, the opinion score of a blog feed is
calculated as follows:

$$\mathrm{Score}_{opin}(\mathrm{Feed}) = \max_{post \in \mathrm{Feed}^{Top}} \mathrm{Score}_{opin}(post)$$

This idea comes from the IR domain, where the
most relevant topic of a document is regarded as
the topic that the document talks about. Thus, we regard
the maximum opinion score of the posts as the feed’s
opinion score.

As in the Average Strategy, we compute the blog feed’s
factual score through the following equation.

$$\mathrm{Score}_{fact}(\mathrm{Feed}) = -\mathrm{Score}_{opin}(\mathrm{Feed})$$
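
The two aggregation strategies differ only in how post-level scores are reduced to one feed-level value, which the following sketch makes explicit. The function names are hypothetical, and returning 0.0 for a feed without retrieved posts is an assumption.

```python
from typing import Callable, List, Tuple


def average_strategy(post_scores: List[float]) -> float:
    """AS: mean opinion score over the feed's posts in the Top N."""
    return sum(post_scores) / len(post_scores) if post_scores else 0.0


def maximum_strategy(post_scores: List[float]) -> float:
    """MS: highest opinion score among the feed's posts in the Top N."""
    return max(post_scores) if post_scores else 0.0


def facet_scores(
    post_scores: List[float],
    strategy: Callable[[List[float]], float] = average_strategy,
) -> Tuple[float, float]:
    """Return (opinion score, factual score); factual = -opinion."""
    opin = strategy(post_scores)
    return opin, -opin


scores = [1.0, 2.0, 6.0]
print(facet_scores(scores))                    # (3.0, -3.0)
print(facet_scores(scores, maximum_strategy))  # (6.0, -6.0)
```

A single long, strongly opinionated post dominates under MS, while AS rewards feeds whose retrieved posts are opinionated throughout.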

Finally, we need to combine the blog feed’s topic-relevant
score and opinion/factual score to generate this
feed’s final ranking score. This ranking score should
consider both the topic relevance and the opinion/factual
facet inclination. It is calculated as follows:

$$\mathrm{Score}(\mathrm{Feed}) = \mathrm{Score}_R(\mathrm{Feed})^{p} \times \mathrm{Score}_{opin/fact}(\mathrm{Feed})^{1-p}$$

where $p$ is a parameter.
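
Reading the exponents in the combination formula as p and 1 − p, the final score is a weighted geometric mean of the two components. The sketch below illustrates that reading only; the value p = 0.5 is a placeholder because the tuned value is not reported, and the code assumes non-negative inputs, since the text does not state how negated (factual) scores are handled before this step.

```python
def combine(relevance: float, facet: float, p: float = 0.5) -> float:
    """Weighted geometric combination: relevance**p * facet**(1 - p).

    p = 0.5 is a placeholder; the paper does not report the value it used.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if relevance < 0 or facet < 0:
        # Assumption of this sketch: both scores are non-negative.
        raise ValueError("this sketch assumes non-negative scores")
    return (relevance ** p) * (facet ** (1.0 - p))


print(combine(4.0, 9.0, p=0.5))  # 2.0 * 3.0 = 6.0
print(combine(4.0, 9.0, p=1.0))  # relevance only: 4.0
```

With p close to 1 the ranking is driven by topic relevance; with p close to 0 it is driven by the facet score.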

3.  Opinion  Retrieval  Model  (ORM) 

In this section, we propose a novel opinion retrieval
model. Following [12], the score of a blog post reflects
not only the topic relevance, but also the opinion/factual
facet, and it can be formulated as follows:

$$\mathrm{Score}(post \mid opin, Q) \propto \mathrm{Score}(post, opin, Q) = p(post)\, p(Q \mid post)\, P(opin \mid Q, post)$$

$$\mathrm{Score}(post \mid fact, Q) \propto \mathrm{Score}(post, fact, Q) = p(post)\, p(Q \mid post)\, P(fact \mid Q, post) = p(post)\, p(Q \mid post)\, \big(1 - P(opin \mid Q, post)\big)$$

We can see two components in the above formulas:
$p(post)\, p(Q \mid post)$, which captures the degree of topic
relevance, and $P(opin \mid Q, post)$, which captures the
degree of opinionatedness. Since the first component can be
calculated through the classic language model, we only
need to compute $P(opin \mid Q, post)$, which is
calculated as follows:

$$P(opin \mid Q, post) = \sum_{s \in opin} \mathrm{co}(s, Q \mid W)_f$$

where $s$ is any sentiment word in the above sentiment
lexicon, and $\mathrm{co}(s, Q \mid W)_f$ is the frequency with which the
sentiment word $s$ co-occurs with any query word of $Q$
within a window of $W$.
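
The co-occurrence count at the core of the model can be sketched as follows. The sketch assumes that the window W is measured in tokens on either side of a sentiment word and that each sentiment occurrence is counted once however many query words fall inside its window; neither detail is given in the text, and the result is a raw count with no normalization. The function name `cooccurrence_count` is hypothetical.

```python
from typing import List, Set


def cooccurrence_count(tokens: List[str], query: Set[str],
                       sentiment: Set[str], window: int) -> int:
    """Count sentiment-word occurrences with a query word within `window` tokens."""
    query_positions = [i for i, t in enumerate(tokens) if t in query]
    count = 0
    for i, t in enumerate(tokens):
        if t in sentiment and any(abs(i - q) <= window for q in query_positions):
            count += 1
    return count


tokens = "farm subsidies are awful but the weather today is good".split()
query = {"farm", "subsidies"}
sentiment = {"awful", "good"}
print(cooccurrence_count(tokens, query, sentiment, window=3))   # 1 ('awful' only)
print(cooccurrence_count(tokens, query, sentiment, window=10))  # 2
```

The example shows how sensitive the count is to W: with a small window, the sentiment word "good" is ignored because it is far from every query term.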

Similar to 3.1.2, the blog feed’s ranking score is
obtained through the following formula:

$$\mathrm{Score}(\mathrm{Feed}) = \Big(\max_{post \in \mathrm{Feed}^{Top}} \mathrm{Score}(post \mid opin/fact, Q)\Big) \times \frac{|\mathrm{Feed}^{Top}|}{|\mathrm{Feed}|}$$

3.2.2.  Personal  vs.  Official  Model 

For the personal/official task, we select three features to
identify the personal/official facet: the existence of
offensive words, the proportion of personal pronouns,
and the maximum named entity proportion. Firstly, we
believe that a blog in which strongly offensive words
appear is less likely to be an official one. For such
feeds, the score is multiplied by a very
large penalty multiplier, so they are less likely to
appear at the top of the result list. Secondly, official
blogs tend to use plural forms of personal pronouns, such
as ‘we’ and ‘our’, to refer to the organization, while personal
blogs tend to use singular forms of personal pronouns.
We calculate the proportions of these two
kinds of personal pronouns as a feature. Finally,
the most obvious feature of an official blog is the
frequent appearance of one and the same named entity. A similar
method has previously been used in [4]. We use Stanford
NER (footnote 3) to tag the named entities, sort the
proportions of all named entities, and select the
maximum one as another feature. We apply a lower bound
and an upper bound to each proportion value: a proportion
p is set to the nearer bound if it falls outside the interval
(min-proportion, max-proportion).
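
The pronoun feature and the bounding of proportions can be illustrated as below. Several details are assumptions because the text leaves them open: the pronoun lists are illustrative, the proportions are computed among first-person pronouns only (the denominator is not specified), and the bounds 0.05 and 0.95 are placeholders for min-proportion and max-proportion.

```python
from typing import List, Tuple

# Illustrative pronoun lists; the lists used by the system are not given.
SINGULAR = {"i", "me", "my", "mine", "myself"}
PLURAL = {"we", "us", "our", "ours", "ourselves"}


def pronoun_proportions(tokens: List[str]) -> Tuple[float, float]:
    """Return (singular share, plural share) among first-person pronouns."""
    lowered = [t.lower() for t in tokens]
    singular = sum(t in SINGULAR for t in lowered)
    plural = sum(t in PLURAL for t in lowered)
    total = singular + plural
    if total == 0:
        return 0.0, 0.0  # no first-person pronouns in the post
    return singular / total, plural / total


def clamp(value: float, lower: float, upper: float) -> float:
    """Set a proportion to the nearer bound when it leaves [lower, upper]."""
    return max(lower, min(upper, value))


pcs, pcp = pronoun_proportions("We love our users and I agree".split())
print(pcs, pcp)                 # one singular and two plural pronouns
print(clamp(0.0, 0.05, 0.95))   # 0.05
```

Clamping keeps a proportion of exactly zero from wiping out a product of features, which is presumably why bounds are applied.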

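Then, the facet score of a blog feed is formulated as
follows:

$$\mathrm{Score}_{official}(\mathrm{Feed}) = \frac{1}{|\mathrm{Feed}^{Top}|} \times \mathrm{Penalty} \times \sum_{\substack{post \in \mathrm{Feed}^{Top} \\ c = \mathrm{Content}(post)}} NEC(c)^{x} \times PCS(c)^{y} \times PCP(c)^{1-x-y}$$

$$\mathrm{Score}_{personal}(\mathrm{Feed}) = \frac{1}{|\mathrm{Feed}^{Top}|} \times \mathrm{Penalty} \times \sum_{\substack{post \in \mathrm{Feed}^{Top} \\ c = \mathrm{Content}(post)}} NEC(c)^{-x} \times PCS(c)^{-y} \times PCP(c)^{-(1-x-y)}$$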


where NEC is the maximum named entity proportion,
PCS is the proportion of singular first-person
pronouns, PCP is the proportion of plural first-person
pronouns, x and y are parameters, and Penalty
is directly proportional to the average number of
strongly offensive words in each blog. Note that a
smaller penalty is applied when calculating the personal
facet. Finally, we combine the facet score and the
topic-relevant score in the same way as in 3.2.1.

3.2.3. In-depth vs. Shallow Model

To establish the in-depth/shallow facet analysis model,
we consider the proportion of regularly built words and
the proportion of long words as our features to identify
the in-depth/shallow facet inclination. According to the
word-building rules, most words used to describe simple
things and activities in daily life are short and irregular.
To describe profound and abstract things, we often use
words built according to special rules, such
as words ending with ‘tion’, ‘ous’ or ‘ly’. If a blog
has a high proportion of this kind of word, it is very
likely that this blog discusses ‘in-depth’ topics rather
than making simple descriptions. So we regard the
proportion of these words as an important feature.


3  http://nlp.stanford.edu/software/CRF-NER.shtml


On the other hand, it is natural to hypothesize that
longer words carry deeper and more complicated
meanings. So we count the words which contain more
than 8 letters and use their proportion as our second
feature.

The in-depth score of one blog post is formulated as
follows:

$$\mathrm{Score}_{In\text{-}depth}(post) = PA \times p + PL \times (1 - p)$$

where PA represents the proportion of regularly built
words, PL stands for the proportion of long words, and
$p$ is a parameter.
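
The post-level in-depth score can be sketched directly from the two features. The suffix list contains only the three endings named in the text, which is almost certainly narrower than the rule set actually used; the tokenizer (alphabetic runs, lowercased) and the default p = 0.5 are assumptions, and the function name `in_depth_score` is hypothetical.

```python
import re
from typing import List

REGULAR_SUFFIXES = ("tion", "ous", "ly")  # only the endings named in the text


def in_depth_score(text: str, p: float = 0.5) -> float:
    """PA * p + PL * (1 - p) for one post.

    PA: share of words with a 'regular' suffix.
    PL: share of words with more than 8 letters.
    """
    words: List[str] = re.findall(r"[A-Za-z]+", text.lower())
    if not words:
        return 0.0
    pa = sum(w.endswith(REGULAR_SUFFIXES) for w in words) / len(words)
    pl = sum(len(w) > 8 for w in words) / len(words)
    return pa * p + pl * (1.0 - p)


# 3 of 6 words have a regular suffix and the same 3 are longer than 8 letters.
print(in_depth_score("The regulation is obviously dangerous today"))  # 0.5
```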

Then,  like  the  idea  of  3.1.2,  we  also  adopt  two  different 
strategies  to  calculate  the  in-depth/shallow  facet  score  of 
the  blog  feed. 

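Average Strategy (AS)

$$\mathrm{Score}_{In\text{-}depth}(\mathrm{Feed}) = \frac{\sum_{post \in \mathrm{Feed}^{Top}} \mathrm{Score}_{In\text{-}depth}(post)}{|\mathrm{Feed}^{Top}|}$$

$$\mathrm{Score}_{shallow}(\mathrm{Feed}) = -\mathrm{Score}_{In\text{-}depth}(\mathrm{Feed})$$

Maximum Strategy (MS)

$$\mathrm{Score}_{In\text{-}depth}(\mathrm{Feed}) = \max_{post \in \mathrm{Feed}^{Top}} \mathrm{Score}_{In\text{-}depth}(post)$$

$$\mathrm{Score}_{shallow}(\mathrm{Feed}) = -\mathrm{Score}_{In\text{-}depth}(\mathrm{Feed})$$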

Finally,  the  idea  of  combining  the  topic-relevant  score 
and  in-depth/shallow  score  is  also  the  same  as  in  3.2.1. 

4.  Result  Analysis 

In this section, we analyze the results of our approaches.
Our approaches are evaluated on the new topics of
TREC 2010. In the baseline sub-task, 46 new topics are
evaluated, and 31 new topics are evaluated in the facet
inclination sub-tasks [7].

Table 1 provides the performance values of our two
baselines. We can see that the query expansion method
applied in PKUTMB2 is effective: MAP, bpref and P@10
all improve. The slightly lower R-prec value may be
because the method cannot improve precision, which is
a generic problem of most existing query expansion
methods [5].

The opinion/factual results are shown in Table 2. We can
see that for the opinion facet, the Average Strategy
performs much better than the others. However, for the
factual facet, the Maximum Strategy performs better, and
our opinion/factual models are more suitable for the opinion
sub-task than for the factual sub-task. We simply use the
topic-relevant scores of the stdbaselines directly in this task.
The different topic-relevance algorithms of the
stdbaselines may explain the insignificant improvements
over these stdbaselines. In addition, the insignificant
improvements of ORM are possibly due to the limitation
of the window size W.

Table 3 shows the personal/official results. In most
cases, our result shows a higher MAP than the baseline,
which indicates the effectiveness of our method. However,
we observe obvious instability across different
baselines. This may also be a result of the differences
between topic-relevance algorithms. The results that are
lower than the baseline are likely a consequence of
randomization effects, which may come from the
following aspects: the possible inclusion of posts or lack
of main blog text, the non-target effect of the algorithm,
the limitation of standard tags, and the simple
multiplicative combination of parameters.

Table 4 provides the results for the in-depth/shallow facet,
which show that our system works well on our own
baselines. However, as in the above two facet sub-tasks,
it is not very effective on the stdbaselines. This reveals
that our algorithm is effective but has some
disadvantages as well. In addition, our system depends
strongly on several parameters, some of which needed to
be changed for the stdbaselines. The results suggest that
we did not adjust them appropriately.


Tag           MAP     R-prec  bpref   P@10
PKUTMB1       0.2453  0.2892  0.2325  0.3304
PKUTMB2       0.2537  0.2882  0.2403  0.3435

Table 1: The performance of our baselines


              Opinion MAP                         Factual MAP
Tag           baseline  AS      MS      ORM       baseline  AS      MS      ORM
PKUTMB1       0.1761    0.2807  0.1804  0.1701    0.2192    0.1399  0.2148  0.1399
PKUTMB2       0.1619    0.2758  0.1740  0.1553    0.2150    0.1394  0.2124  0.1394
stdbaseline1  0.2598    0.2608  0.2603  -         0.2693    0.2705  0.2761  -
stdbaseline2  0.1054    0.1116  0.1068  -         0.2068    0.2081  0.2069  -
stdbaseline3  0.0768    0.0700  0.0723  -         0.1660    0.2344  0.1566  -

Table 2: Opinion/Factual MAP results over five baselines.


              Personal MAP        Official MAP
Tag           baseline  result    baseline  result
PKUTMB1       0.1470    0.1636    0.1820    0.1930
PKUTMB2       0.1441    0.1901    0.1962    0.1950
stdbaseline1  0.1377    0.1575    0.2439    0.2507
stdbaseline2  0.0755    0.0856    0.1938    0.1832
stdbaseline3  0.0900    0.0653    0.2014    0.2127

Table 3: Official/Personal MAP results over five baselines.


              In-depth MAP                Shallow MAP
Tag           baseline  MS      AS        baseline  MS      AS
PKUTMB1       0.1644    0.2407  0.2398    0.1084    0.0874  0.0973
PKUTMB2       0.1533    0.1733  0.1729    0.1005    0.0966  0.0979
stdbaseline1  0.2345    0.0876  0.0876    0.1038    0.0416  0.0416
stdbaseline2  0.1309    0.0662  0.1066    0.1259    0.0528  0.0528
stdbaseline3  0.0756    0.0477  0.0477    0.0923    0.0372  0.0372

Table 4: In-depth/Shallow MAP results over five baselines.


5.  Conclusion  and  Future  Work 

In this paper, we present the PKUTM system for the
Faceted Blog Distillation Task. This task is
usually approached as a two-stage procedure consisting
of a baseline stage and a facet inclination identification
stage. In the baseline stage, an effective approach is
proposed to extract useful words from the topics for
query expansion, which can improve recall.
For the facet inclination stage, several heuristic
methods are used. The experimental results show that these
heuristic methods are effective. We also propose a novel
opinion retrieval model for the opinion/factual facet
inclination sub-task. Our system also has some weak
points; for example, our facet models do not perform well
over the stdbaselines. In the future, we will explore
more robust models.